Unsupervised Learning for Syntactic Disambiguation

Author

Alexander Gelbukh

Abstract

We present a methodological framework for syntactic disambiguation of natural language texts. The method takes an existing manually compiled, non-probabilistic, non-lexicalized grammar and turns it into a probabilistic lexicalized grammar by automatically learning a kind of subcategorization frames or selectional preferences for all words observed in the training corpus. The resulting dictionary of subcategorization frames or selectional preferences can then be used for syntactic disambiguation of previously unseen texts. The learning process is unsupervised and requires no manual markup. The learning algorithm proposed in this paper can incorporate any existing disambiguation method, including linguistically motivated methods of filtering or weighting competing parse trees or syntactic relations, which allows linguistic knowledge to be integrated with unsupervised machine learning.
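The abstract does not spell out the learning procedure, so the following is only a minimal sketch of one possible reading: an iterative reweighting scheme in which every candidate parse of a sentence is scored by the current weights of its word combinations, the scores are normalized within the sentence, and the weights are re-estimated from the resulting fractional counts. The function names `learn_combination_weights` and `best_parse`, the representation of a parse as a list of `(head, dependent)` pairs, and the `smoothing` constant are all illustrative assumptions, not the paper's actual algorithm or data format.

```python
from collections import defaultdict
from math import prod


def learn_combination_weights(corpus, iterations=10, smoothing=0.1):
    """Estimate weights for word combinations from ambiguous parses.

    corpus: list of sentences; each sentence is a list of candidate parses;
    each parse is a list of hashable combinations, e.g. (head, dependent).
    No parse is marked as correct: the procedure is unsupervised.
    """
    # Assumption: start with every combination equally plausible.
    weights = defaultdict(lambda: 1.0)
    for _ in range(iterations):
        counts = defaultdict(float)
        for candidates in corpus:
            # Score each candidate parse by the product of its combination weights.
            scores = [prod(weights[c] for c in parse) for parse in candidates]
            total = sum(scores)
            for parse, score in zip(candidates, scores):
                # Each parse contributes in proportion to its share of the total.
                for combination in parse:
                    counts[combination] += score / total
        # Re-estimate weights from fractional counts; unseen items get `smoothing`.
        weights = defaultdict(
            lambda: smoothing,
            {c: n + smoothing for c, n in counts.items()},
        )
    return dict(weights)


def best_parse(candidates, weights, smoothing=0.1):
    """Pick the candidate parse with the highest product of learned weights."""
    return max(
        candidates,
        key=lambda parse: prod(weights.get(c, smoothing) for c in parse),
    )


# Hypothetical toy corpus: one ambiguous and one unambiguous sentence.
corpus = [
    [  # "eat pizza with fork": the phrase attaches to the verb or to the noun
        [("eat", "pizza"), ("eat", "with fork")],
        [("eat", "pizza"), ("pizza", "with fork")],
    ],
    [  # "eat with fork": only one parse
        [("eat", "with fork")],
    ],
]
weights = learn_combination_weights(corpus)
chosen = best_parse(corpus[0], weights)
```

In this toy corpus the unambiguous sentence lends support to the verb attachment, so the learned weights come to prefer it in the ambiguous sentence as well. A filter or weighting based on linguistic knowledge could, under the same assumptions, be applied to `scores` before normalization.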

Similar resources

An Unsupervised Verb Class Disambiguation

We present an unsupervised learning method for disambiguating verbs that belong to more than one Levin (1993) verb class when they occur in a particular syntactic frame. We used examples that contain unambiguous verbs in each verb class as the training data for ambiguous verbs in that class. A Naive Bayesian classifier was employed for the disambiguation task using context words as features. Our...

Unsupervised Relation Disambiguation Using Spectral Clustering

This paper presents an unsupervised learning approach to disambiguating relations between named entities using various lexical and syntactic features from their contexts. It works by calculating eigenvectors of an adjacency graph’s Laplacian to recover a submanifold of data from a high-dimensional space and then performing cluster number estimation on the eigenvectors. Experiment resu...

Unsupervised Learning of Syntactic Knowledge: Methods and Measures

Supervised methods for ambiguity resolution learn in "sterile" environments, in the absence of syntactic noise. However, in many language engineering applications manually tagged corpora are neither available nor easily implemented. On the other hand, the "exportability" of disambiguation cues acquired from a given noise-free domain (e.g. the Wall Street Journal) to other domains is not obvious. Unsu...

Unsupervised Relation Disambiguation with Order Identification Capabilities

We present an unsupervised learning approach to disambiguating relations between named entities using various lexical and syntactic features from their contexts. It works by calculating eigenvectors of an adjacency graph’s Laplacian to recover a submanifold of data from a high-dimensional space and then performing cluster number estimation on the eigenvectors. This method can address ...

A Computational Model for Chinese Syntactic Structure Induction Based on Sentence Alignment

This paper introduces an unsupervised learning framework for Chinese syntactic structure based on sentence similarity. First, all sentence pairs in the Chinese sentence corpus are aligned, and each pair is partitioned into alternating similar and different segments. Then, aligned similar or different segments are selected as potential constituent candidates ba...

Publication date: 2014
